Gaussian process modelling of multiple short time series
Abstract
We present techniques for effective Gaussian process (GP) modelling of multiple short time series. Such problems commonly arise when GP models are fitted independently to each gene in a gene expression time series data set. Such data sets typically contain very few time points. Naive application of common GP modelling techniques can lead to severe over-fitting or under-fitting in a significant fraction of the fitted models, depending on the details of the data set. We propose avoiding over-fitting by constraining the GP length-scale to values that concentrate most of the spectral energy at frequencies below the Nyquist frequency corresponding to the sampling frequency of the data set. Under-fitting can be avoided by placing more informative priors on the observation noise. Combining these methods allows GP models to be applied reliably and automatically to large numbers of independent short time series. We illustrate this with experiments on both synthetic data and real gene expression data.
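The length-scale constraint can be made concrete with a small sketch. The abstract does not name the kernel, the energy threshold, or the sampling pattern, so the following are assumptions made only for illustration: a squared-exponential kernel, uniformly spaced time points with spacing `dt`, and a hypothetical threshold `fraction`. Under those assumptions the spectral density is Gaussian in frequency with standard deviation 1/(2πℓ), so the share of energy below the Nyquist frequency f_N = 1/(2·dt) is erf(√2·π·ℓ·f_N). The function names are hypothetical and this is not necessarily the authors' exact procedure.

```python
import math
from statistics import NormalDist


def energy_below_nyquist(length_scale, dt):
    """Fraction of squared-exponential spectral energy at |f| <= 1/(2*dt).

    Assumes uniform sampling with spacing dt (an assumption of this sketch).
    """
    nyquist = 1.0 / (2.0 * dt)
    # Spectral density is Gaussian in f with std 1 / (2 * pi * length_scale).
    return math.erf(math.sqrt(2.0) * math.pi * length_scale * nyquist)


def min_length_scale(dt, fraction=0.95):
    """Smallest length-scale keeping `fraction` of the energy below Nyquist.

    `fraction` is a hypothetical threshold, not a value from the paper.
    """
    if not 0.0 < fraction < 1.0:
        raise ValueError("fraction must lie strictly between 0 and 1")
    nyquist = 1.0 / (2.0 * dt)
    # Two-sided Gaussian quantile: P(|Z| <= z) = fraction.
    z = NormalDist().inv_cdf(0.5 * (1.0 + fraction))
    return z / (2.0 * math.pi * nyquist)


if __name__ == "__main__":
    lower_bound = min_length_scale(dt=1.0, fraction=0.95)
    print("lower bound on length-scale:", lower_bound)
    print("energy below Nyquist:", energy_below_nyquist(lower_bound, dt=1.0))
```

The returned value could then serve as a lower bound on the length-scale during hyperparameter optimisation. For unevenly spaced time points, a common feature of gene expression experiments, the choice of `dt` is itself a modelling decision that this sketch does not address.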
Similar resources
Flexible temporal expression profile modelling using the Gaussian process
Motivation: Time course gene expression experiments have proved valuable in a variety of biological studies (e.g., Chuang et al., 2002; Edwards et al., 2003). A general goal common to many of these time course experiments is to identify genes that exhibit different temporal expression profiles across multiple biological conditions. Such experiments are, however, often hampered by the lack of da...
TREND-CYCLE ESTIMATION USING FUZZY TRANSFORM OF HIGHER DEGREE
In this paper, we provide theoretical justification for the application of higher degree fuzzy transform in time series analysis. Under the assumption that a time series can be additively decomposed into a trend-cycle, a seasonal component and a random noise, we demonstrate that the higher degree fuzzy transform technique can be used for the estimation of the trend-cycle, which is one of the ba...
Gaussian processes for time-series modelling.
In this paper, we offer a gentle introduction to Gaussian processes for time-series data analysis. The conceptual framework of Bayesian modelling for time-series data is discussed and the foundations of Bayesian non-parametric modelling presented for Gaussian processes. We discuss how domain knowledge influences design of the Gaussian process models and provide case examples to highlight the ap...
Gaussian Mixture Models for Time Series Modelling, Forecasting, and Interpolation
Gaussian mixture models provide an appealing tool for time series modelling. By embedding the time series to a higher-dimensional space, the density of the points can be estimated by a mixture model. The model can directly be used for short-to-medium term forecasting and missing value imputation. The modelling setup introduces some restrictions on the mixture model, which when appropriately tak...
Semiparametric Bootstrap Prediction Intervals in time Series
One of the main goals of studying the time series is estimation of prediction interval based on an observed sample path of the process. In recent years, different semiparametric bootstrap methods have been proposed to find the prediction intervals without any assumption of error distribution. In semiparametric bootstrap methods, a linear process is approximated by an autoregressive process. The...